Smart Paper Notes

Noise-injected analog Ising machines enable ultrafast statistical sampling and machine learning

Fabian Böhm , Diego Alonso-Urquijo , Guy Verschaffelt , Guy Van der Sande

Category: Machine Learning

2021-12-21

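Ising machines are a promising non-von-Neumann computational concept for neural network training and combinatorial optimization. However, while various neural networks can be implemented with Ising machines, their inability to perform fast statistical sampling makes them less efficient than digital computers for training these networks. Here, we introduce a universal concept for achieving ultrafast statistical sampling with Ising machines by injecting analog noise. With an opto-electronic Ising machine, we demonstrate that this can be used for accurate sampling of Boltzmann distributions and for unsupervised training of neural networks, with accuracy equal to that of software-based training. Through simulations, we find that Ising machines can perform statistical sampling orders of magnitude faster than software-based methods. This makes Ising machines efficient tools for machine learning and for other applications beyond combinatorial optimization.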
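
To make "sampling a Boltzmann distribution" concrete, here is a minimal software sketch of the kind of baseline the abstract compares against. It is not the authors' noise-injected analog scheme: it is ordinary Gibbs sampling on a tiny Ising model, checked against the exact distribution obtained by enumeration. The function names, the energy convention E(s) = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i, and the symmetric coupling matrix `J` are assumptions made for illustration.

```python
import itertools
import math
import random


def energy(spins, J, h):
    """Ising energy E(s) = -sum_{i<j} J[i][j] s_i s_j - sum_i h[i] s_i."""
    n = len(spins)
    pair = sum(J[i][j] * spins[i] * spins[j]
               for i in range(n) for j in range(i + 1, n))
    field = sum(h[i] * spins[i] for i in range(n))
    return -pair - field


def exact_boltzmann(J, h, beta=1.0):
    """Enumerate all 2^n spin states and return exact Boltzmann probabilities."""
    states = list(itertools.product((-1, 1), repeat=len(h)))
    weights = [math.exp(-beta * energy(s, J, h)) for s in states]
    z = sum(weights)  # partition function
    return {s: w / z for s, w in zip(states, weights)}


def gibbs_sample(J, h, n_sweeps, beta=1.0, seed=0):
    """Estimate state frequencies with single-spin Gibbs updates (J symmetric)."""
    rng = random.Random(seed)
    n = len(h)
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    counts = {}
    for _ in range(n_sweeps):
        for i in range(n):
            # Local field on spin i from its bias and all other spins.
            local = h[i] + sum(J[i][j] * spins[j] for j in range(n) if j != i)
            # Conditional probability that spin i is +1 given the rest.
            p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * local))
            spins[i] = 1 if rng.random() < p_up else -1
        key = tuple(spins)
        counts[key] = counts.get(key, 0) + 1
    return {s: c / n_sweeps for s, c in counts.items()}
```

Enumeration is only feasible for a handful of spins; it serves here purely as a reference for checking the sampler, which is the role the exact Boltzmann distribution plays when sampling accuracy is evaluated.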

translated by Google Translate

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.


Certifying Fairness of Probabilistic Circuits

Nikil Roashan Selvam , Guy Van den Broeck , YooJung Choi

Categories: Machine Learning | Artificial Intelligence

2022-12-05

With the increased use of machine learning systems for decision making, questions about the fairness properties of such systems start to take center stage. Most existing work on algorithmic fairness assumes complete observation of features at prediction time, as is the case for popular notions like statistical parity and equal opportunity. However, this is not sufficient for models that can make predictions with partial observation as we could miss patterns of bias and incorrectly certify a model to be fair. To address this, a recently introduced notion of fairness asks whether the model exhibits any discrimination pattern, in which an individual characterized by (partial) feature observations receives vastly different decisions merely by disclosing one or more sensitive attributes such as gender and race. By explicitly accounting for partial observations, this provides a much more fine-grained notion of fairness. In this paper, we propose an algorithm to search for discrimination patterns in a general class of probabilistic models, namely probabilistic circuits. Previously, such algorithms were limited to naive Bayes classifiers which make strong independence assumptions; by contrast, probabilistic circuits provide a unifying framework for a wide range of tractable probabilistic models and can even be compiled from certain classes of Bayesian networks and probabilistic programs, making our method much more broadly applicable. Furthermore, for an unfair model, it may be useful to quickly find discrimination patterns and distill them for better interpretability. As such, we also propose a sampling-based approach to more efficiently mine discrimination patterns, and introduce new classes of patterns such as minimal, maximal, and Pareto optimal patterns that can effectively summarize exponentially many discrimination patterns.


Improving language models by retrieving from trillions of tokens

Sebastian Borgeaud , Arthur Mensch , Jordan Hoffmann , Trevor Cai , Eliza Rutherford , Katie Millican , George van den Driessche , Jean-Baptiste Lespiau , Bogdan Damoc , Aidan Clark

Categories: Natural Language Processing | Machine Learning

2021-12-08

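We enhance auto-regressive language models by conditioning on document chunks retrieved from a large corpus, based on local similarity with preceding tokens. Despite using 25$\times$ fewer parameters, our Retrieval-Enhanced Transformer (RETRO) obtains performance comparable to GPT-3 and Jurassic-1. After fine-tuning, RETRO performance translates to downstream knowledge-intensive tasks such as question answering. RETRO combines a frozen BERT retriever, a differentiable encoder, and a chunked cross-attention mechanism to predict tokens based on an order of magnitude more data than is typically consumed during training. We typically train RETRO from scratch, yet can also rapidly retrofit pre-trained transformers with retrieval and still achieve good performance. Our work opens up new avenues for improving language models through explicit memory at unprecedented scale.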
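
As a rough illustration of retrieving document chunks by similarity to preceding tokens, the sketch below splits a corpus into fixed-size chunks and returns the nearest ones to a query. It is a toy under stated assumptions, not RETRO's pipeline: the bag-of-words `embed` function is a hypothetical stand-in for the frozen BERT retriever, and `chunk`, `cosine`, and `retrieve` are illustrative names.

```python
import math
from collections import Counter


def chunk(tokens, size):
    """Split a token list into consecutive chunks of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def embed(tokens):
    """Toy bag-of-words embedding (assumption: stands in for a frozen encoder)."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_tokens, database_chunks, k=2):
    """Return the k database chunks most similar to the query tokens."""
    q = embed(query_tokens)
    ranked = sorted(database_chunks,
                    key=lambda c: cosine(q, embed(c)),
                    reverse=True)
    return ranked[:k]
```

In a retrieval-enhanced model the retrieved chunks would then be encoded and attended to while predicting the next tokens; that part is omitted here.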


Lossless Compression with Probabilistic Circuits

Anji Liu , Stephan Mandt , Guy Van den Broeck

Category: Machine Learning

2021-11-23

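Despite extensive progress on image generation, deep generative models are suboptimal when applied to lossless compression. For example, models such as VAEs suffer from a compression cost overhead due to their latent variables, which can only be partially eliminated with elaborate schemes such as bits-back coding, resulting in poor single-sample compression rates. To overcome such problems, we establish a new class of tractable lossless compression models that permit efficient encoding and decoding: Probabilistic Circuits (PCs). These are a class of neural networks involving $|p|$ computational units that support efficient marginalization over arbitrary subsets of the $D$ feature dimensions, enabling efficient arithmetic coding. We derive efficient encoding and decoding schemes that both have time complexity $\mathcal{O}(\log(D) \cdot |p|)$, where a naive scheme would have linear costs in $D$ and $|p|$, making the approach highly scalable. Empirically, our PC-based (de)compression algorithm runs 5-20 times faster than neural compression algorithms that achieve similar bitrates. By scaling up the traditional PC structure learning pipeline, we achieve state-of-the-art results on image datasets such as MNIST. Furthermore, PCs can be naturally integrated with existing neural compression algorithms to improve the performance of these base models on natural image datasets. Our results highlight the potential impact that non-standard learning architectures may have on neural data compression.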
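
The link between marginalization and arithmetic coding can be sketched as follows. An arithmetic coder that encodes one feature at a time needs each conditional p(x_i | x_1..x_{i-1}), which is a ratio of two marginals, and it spends about -log2 of that conditional in bits. The code below is only an illustration of that bookkeeping: it marginalizes a small explicit joint table by brute force, whereas a probabilistic circuit would supply the same marginals efficiently. The names `marginal` and `ideal_code_length` and the dictionary representation of the joint are assumptions.

```python
import math


def marginal(joint, prefix):
    """Probability that the first len(prefix) features equal `prefix`.

    `joint` maps full assignments (tuples) to probabilities; the sum runs
    over every full assignment that agrees with the prefix.
    """
    k = len(prefix)
    return sum(p for x, p in joint.items() if x[:k] == tuple(prefix))


def ideal_code_length(joint, x):
    """Bits an ideal arithmetic coder spends on x, feature by feature."""
    bits = 0.0
    for i in range(len(x)):
        # p(x_i | x_1..x_{i-1}) as a ratio of two prefix marginals.
        cond = marginal(joint, x[:i + 1]) / marginal(joint, x[:i])
        bits += -math.log2(cond)
    return bits
```

By the chain rule the per-feature costs telescope, so the total equals -log2 p(x); a real coder adds a small constant overhead on top of this ideal length.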


Solving Marginal MAP Exactly by Probabilistic Circuit Transformations

YooJung Choi , Tal Friedman , Guy Van den Broeck

Categories: Artificial Intelligence | Machine Learning

2021-11-08

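Probabilistic circuits (PCs) are a class of tractable probabilistic models that allow efficient, often linear-time, inference of queries such as marginals and most probable explanations (MPE). However, marginal MAP, which is central to many decision-making problems, remains a hard query for PCs unless they satisfy highly restrictive structural constraints. In this paper, we develop a pruning algorithm that removes parts of the PC that are irrelevant to a marginal MAP query, shrinking the PC while maintaining the correct solution. This pruning technique is so effective that we are able to build a marginal MAP solver based solely on iteratively transforming the circuit, with no search required. We empirically demonstrate the efficacy of our approach on real-world datasets.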


Meta-learning generalizable dynamics from trajectories

Qiaofeng Li , Tianyi Wang , Vwani Roychowdhury , M. Khalid Jawed

Category: Machine Learning

2023-01-03

We present the interpretable meta neural ordinary differential equation (iMODE) method to rapidly learn generalizable (i.e., not parameter-specific) dynamics from trajectories of multiple dynamical systems that vary in their physical parameters. The iMODE method learns meta-knowledge, the functional variations of the force field of dynamical system instances without knowing the physical parameters, by adopting a bi-level optimization framework: an outer level capturing the common force field form among studied dynamical system instances and an inner level adapting to individual system instances. A priori physical knowledge can be conveniently embedded in the neural network architecture as inductive bias, such as conservative force field and Euclidean symmetry. With the learned meta-knowledge, iMODE can model an unseen system within seconds, and inversely reveal knowledge on the physical parameters of a system, or as a Neural Gauge to "measure" the physical parameters of an unseen system with observed trajectories. We test the validity of the iMODE method on bistable, double pendulum, Van der Pol, Slinky, and reaction-diffusion systems.


Online Real-time Learning of Dynamical Systems from Noisy Streaming Data

S. Sinha , Sai P. Nandanoori , David Barajas-Solano

Category: Machine Learning

2022-12-10

Recent advancements in sensing and communication facilitate obtaining high-frequency real-time data from various physical systems like power networks, climate systems, biological networks, etc. However, since the data are recorded by physical sensors, it is natural that the obtained data is corrupted by measurement noise. In this paper, we present a novel algorithm for online real-time learning of dynamical systems from noisy time-series data, which employs the Robust Koopman operator framework to mitigate the effect of measurement noise. The proposed algorithm has three main advantages: a) it allows for online real-time monitoring of a dynamical system; b) it obtains a linear representation of the underlying dynamical system, thus enabling the user to use linear systems theory for analysis and control of the system; c) it is computationally fast and less intensive than the popular Extended Dynamic Mode Decomposition (EDMD) algorithm. We illustrate the efficiency of the proposed algorithm by applying it to identify the Van der Pol oscillator, the IEEE 68 bus system, and a ring network of Van der Pol oscillators.
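
For orientation, the idea of obtaining a linear representation of a dynamical system from time-series data can be sketched with a plain least-squares fit of a linear operator to snapshot pairs. This is the standard, non-robust fit, not the Robust Koopman algorithm of the paper, and it works on raw 2-D states rather than a dictionary of observables as EDMD would. All function names are illustrative assumptions.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]


def inverse_2x2(M):
    """Invert a 2x2 matrix, refusing numerically singular input."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("snapshot Gram matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def fit_linear_operator(snapshots):
    """Least-squares K minimising sum_t ||x_{t+1} - K x_t||^2 for 2-D states.

    `snapshots` is a list of [x, y] states ordered in time. The solution of
    the normal equations is K = Y X^T (X X^T)^{-1}.
    """
    X = transpose(snapshots[:-1])  # 2 x (T-1), columns are x_t
    Y = transpose(snapshots[1:])   # 2 x (T-1), columns are x_{t+1}
    Xt = transpose(X)
    gram = matmul(X, Xt)           # X X^T
    cross = matmul(Y, Xt)          # Y X^T
    return matmul(cross, inverse_2x2(gram))
```

With noisy measurements this ordinary fit is biased, which is the kind of effect a robust formulation is meant to mitigate; an online variant would update the Gram and cross matrices incrementally as new snapshots arrive instead of recomputing them.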


PRISM: Probabilistic Real-Time Inference in Spatial World Models

Atanas Mirchev , Baris Kayalibay , Ahmed Agha , Patrick van der Smagt , Daniel Cremers , Justin Bayer

Categories: Machine Learning | Computer Vision | Robotics | Machine Learning (Statistics)

2022-12-06

We introduce PRISM, a method for real-time filtering in a probabilistic generative model of agent motion and visual perception. Previous approaches either lack uncertainty estimates for the map and agent state, do not run in real-time, do not have a dense scene representation or do not model agent dynamics. Our solution reconciles all of these aspects. We start from a predefined state-space model which combines differentiable rendering and 6-DoF dynamics. Probabilistic inference in this model amounts to simultaneous localisation and mapping (SLAM) and is intractable. We use a series of approximations to Bayesian inference to arrive at probabilistic map and state estimates. We take advantage of well-established methods and closed-form updates, preserving accuracy and enabling real-time capability. The proposed solution runs at 10Hz real-time and is similarly accurate to state-of-the-art SLAM in small to medium-sized indoor environments, with high-speed UAV and handheld camera agents (Blackbird, EuRoC and TUM-RGBD).


Adaptive Sequential Surveillance with Network and Temporal Dependence

Ivana Malenica , Jeremy R. Coyle , Mark J. van der Laan , Maya L. Petersen

Category: Machine Learning (Statistics)

2022-12-05

Strategic test allocation plays a major role in the control of both emerging and existing pandemics (e.g., COVID-19, HIV). Widespread testing supports effective epidemic control by (1) reducing transmission via identifying cases, and (2) tracking outbreak dynamics to inform targeted interventions. However, infectious disease surveillance presents unique statistical challenges. For instance, the true outcome of interest, one's positive infectious status, is often a latent variable. In addition, the presence of both network and temporal dependence reduces the data to a single observation. As testing entire populations regularly is neither efficient nor feasible, standard approaches to testing recommend simple rule-based testing strategies (e.g., symptom based, contact tracing), without taking into account individual risk. In this work, we study an adaptive sequential design involving $n$ individuals over a period of $\tau$ time-steps, which allows for unspecified dependence among individuals and across time. Our causal target parameter is the mean latent outcome we would have obtained after one time-step, if, starting at time $t$ given the observed past, we had carried out a stochastic intervention that maximizes the outcome under a resource constraint. We propose an Online Super Learner for adaptive sequential surveillance that learns the optimal choice of testing strategies over time while adapting to the current state of the outbreak. Relying on a series of working models, the proposed method learns across samples, through time, or both, based on the underlying (unknown) structure in the data. We present an identification result for the latent outcome in terms of the observed data, and demonstrate the superior performance of the proposed strategy in a simulation modeling a residential university environment during the COVID-19 pandemic.
